A Novel Comprehensive Approach for Estimating Concept Semantic Similarity in WordNet

نویسندگان

  • Xiao-gang Zhang
  • Shou-qian Sun
  • Ke-jun Zhang
چکیده

Computation of semantic similarity between concepts is an important foundation for many research works. This paper focuses on IC (information content) computing methods and IC measures, which estimate the semantic similarities between concepts by exploiting the topological parameters of the taxonomy. Based on analyzing representative IC computing methods and typical semantic similarity measures, we propose a new hybrid IC computing method. Through adopting the parameter dhyp and lch, we utilize the new IC computing method and propose a novel comprehensive measure of semantic similarity between concepts. An experiment based on WordNet “is a” taxonomy has been designed to test representative measures and our measure on benchmark dataset R&G, and the results show that our measure can obviously improve the similarity accuracy. We evaluate the proposed approach by comparing the correlation coefficients between five measures (the proposed approach, four other similarity methods) and the artificial data. The results show that our proposal outperforms the previous measures.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Construction of Persian ICT WordNet using Princeton WordNet

WordNet is a large lexical database of English language, in which, nouns, verbs, adjectives, and adverbs are grouped into sets of cognitive synonyms (synsets). Each synset expresses a distinct concept. Synsets are interlinked by both semantic and lexical relations. WordNet is essentially used for word sense disambiguation, information retrieval, and text translation. In this paper, we propose s...

متن کامل

Query Architecture Expansion in Web Using Fuzzy Multi Domain Ontology

Due to the increasing web, there are many challenges to establish a general framework for data mining and retrieving structured data from the Web. Creating an ontology is a step towards solving this problem. The ontology raises the main entity and the concept of any data in data mining. In this paper, we tried to propose a method for applying the "meaning" of the search system, But the problem ...

متن کامل

A semantic relatedness metric based on free link structure

While shortest paths in WordNet are known to correlate well with semantic similarity, an is-a hierarchy is less suited for estimating semantic relatedness. We demonstrate this by comparing two free scale networks ( ConceptNet and Wikipedia) to WordNet. Using the Finkelstein353 dataset we show that a shortest path metric run on Wikipedia attains a better correlation than WordNet-based metrics. C...

متن کامل

A semantic relatedness metric based on free link structure (short paper)

While shortest paths in WordNet are known to correlate well with semantic similarity, an is-a hierarchy is less suited for estimating semantic relatedness. We demonstrate this by comparing two free scale networks ( ConceptNet and Wikipedia) to WordNet. Using the Finkelstein353 dataset we show that a shortest path metric run on Wikipedia attains a better correlation than WordNet-based metrics. C...

متن کامل

A New Model of Information Content Based on Concept’s Topology for Measuring Semantic Similarity in WordNet

Information content plays an important role in measuring semantic similarity of concepts. The conventional way of IC obtained is through statistical analysis of corpora. Recently corpora–independent model has attracted great concern in this area. This paper analyzes the state-of-art IC models, highlights important related issues, and presents a novel IC model based on concepts’ topology in Word...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • CoRR

دوره abs/1703.01726  شماره 

صفحات  -

تاریخ انتشار 2017